With all of the work on datasets / benchmarking I've been doing lately, I thought of cleaning up the code and creating some reusable structures to limit code duplication and ease the creation of new datasets / benchmarks.

This pull request is definitely not in a state to be merged, but I'm opening it anyway as a starting point for discussion / to receive feedback. Most things in here are just ideas, meant to start the conversation on more generic datasets / parsers / (competition) benchmarks. I'm not at all "attached" to any of the code. I just tried to get something working first; now we can start discussing how it should be done "properly".

cpmpy.tools.dataset

A new dataset module is introduced (cpmpy.tools.dataset) as a central place to collect ... datasets. I could have placed the code directly in here, but as discussed multiple times internally, we have multiple different concepts of datasets. Below is a sketch:

[image: sketch of the different dataset concepts]

We have

  • "model"-datasets, files which directly describe a CP model, e.g. XCSP3
  • "problem"-datasets, files which describe data to be used as input for generating a CP model using a "model generator", e.g. psplib

Due to this distinction, I also put the "model"-datasets inside a "model" subdirectory.
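
In pseudo-code, the distinction boils down to something like the sketch below (read_xcsp3 is the parser added in this PR; the instance path, the toy data and the toy "model generator" are purely illustrative, not part of this PR):

from cpmpy import Model, intvar
from cpmpy.tools.xcsp3 import read_xcsp3   # "model" dataset parser added in this PR

# "model" dataset: the file directly describes a CP model
model = read_xcsp3("instance.xml")         # placeholder path to an XCSP3 instance

# "problem" dataset: the file only contains problem data; a "model generator"
# (user code, a toy one sketched here) turns that data into a CP model
def make_toy_schedule_model(durations):    # hypothetical model generator
    start = intvar(0, sum(durations), shape=len(durations), name="start")
    return Model([start[i] + durations[i] <= start[i + 1] for i in range(len(durations) - 1)])

data = [3, 2, 4]                           # stands in for parsed problem data (e.g. from PSPLIB)
model = make_toy_schedule_model(data)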

3 "model" datasets have been added:

  • MaxSat Eval (mse)
  • XCSP3
  • PB competition

(I have a version of PSPLIB, but this one actually belongs to the "problem"-dataset category.)

Each dataset subclasses the generic _Dataset, which implements logic shared across all datasets and provides dataset-specific methods to be overridden. In most cases, a dataset only defines its constructor arguments (e.g. year, track, ...) and a download method, so adding new datasets is quite easy.
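
As a rough illustration of what adding a dataset could look like (the import path, base-class interface and method signatures below are assumptions based on the description above, not the final API):

from cpmpy.tools.dataset import _Dataset   # assumed import path for the generic base class

class MyCompetitionDataset(_Dataset):
    """Hypothetical dataset for some yearly competition."""

    def __init__(self, year: int = 2024, track: str = "main", **kwargs):
        # dataset-specific constructor arguments
        self.year = year
        self.track = track
        super().__init__(**kwargs)

    def download(self, target_dir: str = "."):
        # dataset-specific download logic, e.g. fetch and unpack the instance archive
        ...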

Parsers

For each of the datasets, respective parsers have been added to the tools:

  • MSE: from cpmpy.tools.wcnf import read_wcnf
  • XCSP3: from cpmpy.tools.xcsp3 import read_xcsp3
  • PB: from cpmpy.tools.opb import read_opb

You'll notice some differences in the names; this is because data formats are more generic than datasets, e.g. MSE instances are formulated in the more generic WCNF format.
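
As a minimal usage sketch (the instance path is a placeholder, and it is assumed that each reader returns a regular CPMpy Model, as suggested by their use in the Benchmark runner below):

from cpmpy.tools.opb import read_opb

model = read_opb("instance.opb")    # placeholder path; parse an OPB file into a (assumed) CPMpy Model
model.solve(solver="ortools")       # from here on it behaves like a normal CPMpy model
print(model.status())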

cpmpy.tools.benchmark

Whilst running experiments, I've collected many variants of the XCSP3 benchmark runner, adapted to the other datasets. So I thought, why not do the same exercise here? This part is still the most uncertain in terms of how it should best be done.

So next to data formats and datasets, we also have "formalised" benchmarks. They're decoupled from both the parser and the dataset. Take the PB competition as an example: it defines an input format, an output format, and rules on how to "behave" (e.g. how to handle a SIGTERM). The OPB parser covers the input part, so that one gets reused. The PB competition dataset covers instances to test on, but any other dataset in the OPB format can also be used within the rules of the PB competition. All the other competition rules get captured in this new "benchmark" object. I again provided a more generic Benchmark to be subclassed, but in this case it is also usable on its own:

from cpmpy.tools.benchmark import Benchmark   # the generic benchmark runner introduced in this PR
from cpmpy.tools.wcnf import read_wcnf        # your custom model parser or one included in CPMpy

bm = Benchmark(reader=read_wcnf)
bm.run(
    instance="example.wcnf",     # your benchmark instance (e.g. coming from a CPMpy model dataset)
    solver="ortools",
    time_limit=30,
    mem_limit=1024,
    verbose=True
)

Simply provide a callable parser and a path to an instance, and the model will be created and solved, with us handling all the niceties (memory limits, timeouts, printing, capturing results, ...). Many more arguments are available (like with the XCSP3 competition):

def run(
        self,
        instance: str,                          # path to the instance to run
        open: Optional[callable] = None,        # how to 'open' the instance file
        seed: Optional[int] = None,             # random seed
        time_limit: Optional[int] = None,       # time limit for this single instance
        mem_limit: Optional[int] = None,        # memory limit in MiB (1024 * 1024 bytes)
        cores: int = 1,                         # number of cores to use
        solver: str = None,                     # which backend solver to use
        time_buffer: int = 0,                   # time reserved within the time limit (e.g. for printing results)
        intermediate: bool = False,             # whether to report intermediate solutions
        verbose: bool = False,                  # verbose output
        **kwargs,
    ):

But to follow the PB-competition-specific rules, we also have the pre-made OPBBenchmark subclass, which customises Benchmark to the rules of the PB competition. Since a lot of the Benchmark's behavior has been compartmentalized into different methods, any subclass can easily override these to customise according to the competition rules (e.g. how to format the result, how to report on intermediate results, how to handle SIGTERMs, ...). This subclassing of Benchmark allows for the creation of many competition runners with as little duplicate code as possible.
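
For illustration, a competition-specific runner could then look roughly like this (the hook names format_result and handle_sigterm are placeholders for whatever the compartmentalized methods end up being called):

from cpmpy.tools.opb import read_opb
from cpmpy.tools.benchmark import Benchmark   # module introduced in this PR

class MyPBBenchmark(Benchmark):
    """Hypothetical PB-style runner; the hook names below are placeholders."""

    def __init__(self, **kwargs):
        super().__init__(reader=read_opb, **kwargs)

    def format_result(self, result):
        # e.g. print status / objective in the competition's required output format
        ...

    def handle_sigterm(self, signum, frame):
        # e.g. print the best solution found so far before exiting
        ...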


That's about it. A lot of code that should probably become separate pull requests after we figure out what to do with it.

(tools.xcsp3 still contains a lot of code from before I attempted to bring things together, e.g. it still has its own dataset / benchmark runner)
